When looking at the alpha diversity of infected vs. uninfected mice, the communities do not appear to differ; they follow a similar trend regardless of infection. But when looking at the colonization level within the infected mice, the distribution appears to be bimodal.
To investigate whether the bimodal distribution of colonization level vs. alpha diversity is related to colonization itself, we can look at how the communities are distributed prior to infection. If the difference in distribution is related to the level of colonization, we would expect end point colonization levels to be randomly distributed across the time points prior to infection. When we color the points by end point colonization, diversity appears to be lower only in the highly colonized mice.
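A minimal sketch of how end point colonization could be attached to the earlier samples so they can be colored or summarized by it. The data frame, the column names `invsimpson` and `end_colonization`, the `1e6` cutoff for "high", and treating `day <= 0` as prior to infection are all assumptions for illustration, not the values used in this analysis.

```r
# Hypothetical samples: three mice, each sampled before infection and at the end point.
samples <- data.frame(
  mouse      = rep(c("m1", "m2", "m3"), each = 3),
  day        = rep(c(-1, 0, 10), times = 3),
  invsimpson = c(2.1, 1.8, 1.5, 9.4, 8.7, 10.2, 2.6, 2.2, 1.9),
  CFU        = c(NA, NA, 1e8, NA, NA, 1e3, NA, NA, 5e7)
)

# Take each mouse's CFU on the last sampled day as its end point colonization.
end_day  <- max(samples$day)
endpoint <- samples[samples$day == end_day, c("mouse", "CFU")]
names(endpoint)[2] <- "end_CFU"

# Assumed cutoff separating "high" from "low" colonization.
high_cutoff <- 1e6
endpoint$end_colonization <- ifelse(endpoint$end_CFU >= high_cutoff, "high", "low")

# Carry the end point label back onto every sample from the same mouse.
labelled <- merge(samples, endpoint, by = "mouse")

# Summarize diversity before infection by the eventual colonization level.
pre_infection <- labelled[labelled$day <= 0, ]
aggregate(invsimpson ~ end_colonization, data = pre_infection, FUN = median)
```

The `end_colonization` column can then be mapped to point color in whichever plotting code is already in use.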
So, comparing the mice infected with C. difficile, is there a decrease in diversity in the mice that become highly colonized at the final time point?
## Do stats!
##
## Wilcoxon signed rank test with continuity correction
##
## data: value
## V = 1275, p-value = 7.79e-10
## alternative hypothesis: true location is not equal to 0
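A note on reading this output, with a sketch. V = 1275 equals 50 * 51 / 2, the largest value the signed rank statistic can take for 50 nonzero observations, which is what a one-sample test of all-positive values against 0 produces. If the question is whether diversity differs between highly and lowly colonized mice, a two-sample form may be closer to what is wanted. The simulated data and the column names below are assumptions for illustration only.

```r
set.seed(1)

# Hypothetical end point diversity for infected mice; rlnorm keeps every value positive.
diversity <- data.frame(
  colonization = rep(c("high", "low"), each = 25),
  invsimpson   = c(rlnorm(25, meanlog = log(3), sdlog = 0.3),
                   rlnorm(25, meanlog = log(8), sdlog = 0.3))
)

# One-sample signed rank test against 0, the form that prints "data: value".
one_sample <- wilcox.test(diversity$invsimpson, mu = 0)

# Largest possible V for n nonzero observations.
n     <- nrow(diversity)
max_v <- n * (n + 1) / 2

# Two-sample rank sum test comparing the colonization groups directly.
two_sample <- wilcox.test(invsimpson ~ colonization, data = diversity)

one_sample$statistic
max_v
two_sample$p.value
```

Because every simulated value is positive, the one-sample statistic reaches `max_v` regardless of any group difference, so only the two-sample test speaks to the high vs. low comparison.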
Are differences in end point diversity due not to colonization but to the antibiotic (abx)?
Since high and low colonization seem to be grouped by antibiotic, is the difference due to the antibiotic or to C. difficile?
How do specific communities transition from abx treatment to infection?
## [[1]]
## [[1]]
##
## Call:
## lm(formula = get(variable_name) ~ CFU, data = data_frame)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.162 -2.412 -1.213 1.429 13.075
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.241e+00 1.714e-01 30.58 < 2e-16 ***
## CFU -1.455e-08 2.479e-09 -5.87 8.24e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.272 on 468 degrees of freedom
## (36 observations deleted due to missingness)
## Multiple R-squared: 0.06859, Adjusted R-squared: 0.0666
## F-statistic: 34.46 on 1 and 468 DF, p-value: 8.244e-09
## [[1]]
## [[1]]
##
## Call:
## lm(formula = get(variable_name) ~ CFU, data = data_frame)
##
## Residuals:
## Min 1Q Median 3Q Max
## -35.753 -15.274 -4.381 12.851 72.405
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.609e+01 1.047e+00 44.03 < 2e-16 ***
## CFU -1.104e-07 1.514e-08 -7.29 1.33e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 19.99 on 468 degrees of freedom
## (36 observations deleted due to missingness)
## Multiple R-squared: 0.102, Adjusted R-squared: 0.1001
## F-statistic: 53.14 on 1 and 468 DF, p-value: 1.331e-12
## [[1]]
## [[1]]
##
## Call:
## lm(formula = get(variable_name) ~ CFU, data = data_frame)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.65813 -0.51173 -0.06297 0.45801 2.09371
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.899e+00 3.404e-02 55.801 < 2e-16 ***
## CFU -3.098e-09 4.923e-10 -6.293 7.15e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.6498 on 468 degrees of freedom
## (36 observations deleted due to missingness)
## Multiple R-squared: 0.07803, Adjusted R-squared: 0.07606
## F-statistic: 39.61 on 1 and 468 DF, p-value: 7.149e-10
## [[1]]
## [[1]]
##
## Call:
## lm(formula = get(variable_name) ~ CFU, data = data_frame)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.9720 -1.5949 -0.8509 0.8313 14.2621
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.051e+00 1.503e-01 26.944 < 2e-16 ***
## CFU -6.395e-09 1.924e-09 -3.324 0.000978 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.435 on 366 degrees of freedom
## (36 observations deleted due to missingness)
## Multiple R-squared: 0.0293, Adjusted R-squared: 0.02665
## F-statistic: 11.05 on 1 and 366 DF, p-value: 0.0009777
## [[1]]
## [[1]]
##
## Call:
## lm(formula = get(variable_name) ~ CFU, data = data_frame)
##
## Residuals:
## Min 1Q Median 3Q Max
## -28.215 -12.009 -2.285 9.603 58.318
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.882e+01 9.625e-01 40.329 < 2e-16 ***
## CFU -6.051e-08 1.232e-08 -4.913 1.36e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 15.59 on 366 degrees of freedom
## (36 observations deleted due to missingness)
## Multiple R-squared: 0.06187, Adjusted R-squared: 0.0593
## F-statistic: 24.14 on 1 and 366 DF, p-value: 1.356e-06
## [[1]]
## [[1]]
##
## Call:
## lm(formula = get(variable_name) ~ CFU, data = data_frame)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.43246 -0.41705 -0.08106 0.43371 1.47089
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.674e+00 3.495e-02 47.885 < 2e-16 ***
## CFU -1.551e-09 4.473e-10 -3.468 0.000588 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.5662 on 366 degrees of freedom
## (36 observations deleted due to missingness)
## Multiple R-squared: 0.03181, Adjusted R-squared: 0.02916
## F-statistic: 12.02 on 1 and 366 DF, p-value: 0.0005875
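The six fits above share the call `lm(formula = get(variable_name) ~ CFU, data = data_frame)`, so the printed output does not say which diversity metric each one belongs to. A minimal sketch of one way to keep the metric name attached to each fit follows. Since CFU spans 0 to 8.1e8, the raw slopes are tiny; the `log10(CFU + 1)` transform is an assumption offered as an option, not what was run above. The simulated data and the metric names `sobs` and `shannon` are hypothetical.

```r
set.seed(42)

# Hypothetical alpha diversity table; the real one would come from the analysis.
alpha_df <- data.frame(
  CFU     = 10^runif(60, min = 2, max = 8),
  sobs    = rpois(60, lambda = 40),
  shannon = runif(60, min = 0.5, max = 3)
)

# Build the formula from the metric name so the fitted model records its response.
fit_by_cfu <- function(variable_name, data_frame) {
  model_formula <- reformulate("log10(CFU + 1)", response = variable_name)
  lm(model_formula, data = data_frame)
}

# A named list keeps each fit labelled by its metric when printed.
metrics <- c("sobs", "shannon")
fits <- lapply(setNames(metrics, metrics), fit_by_cfu, data_frame = alpha_df)

# Coefficient tables, one per metric.
lapply(fits, function(fit) coef(summary(fit)))
```

With a named list, the print headers become `$sobs` and `$shannon` instead of repeated `[[1]]`, and `formula(fits$sobs)` shows the response that was modelled.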
After talking with Pat (1/17/18):
- Does the end point look like pre_abx? Are the most similar ones the recovered ones?
- Split analysis by abx/dose/days recovered
- Resistance more similar than colonized?
- Start with high dose and 1 day recovery, then look at the effect of modulating the dose/recovery
- Train model with low recovery and test with high recovery
- Show different context of day 0
- Compare differences in metro recovery
- How do susceptibility break points compare?
The metadata file has the following columns:
- group
- CFU: ranges from 0 to 8.1e8, with 601 NAs (most NAs are on days when C. difficile was not present, so they can be changed to 0, except for NAs after day 1)
- cage
- mouse
- day: ranges from -11 to 10
- abx: amp (405), cef (379), cipro (83), clinda (190), metro (339), none (3), strep (362), vanc (312)
- dose: 0.1 (304), 0.3 (253), 0.5 (653), 0.625 (112), 1 (339), 10mg/kg (273), 5 (136), NA's (3)

dose    abx     cages   mice
10      cipro
10      clinda
0.1     cef
0.3     cef
0.5     cef
1       metro
0.1     strep
0.5     strep
5       strep
0.1     vanc
0.3     vanc
0.625   vanc

- cdiff: whether the sample was challenged with C. difficile; logical, T (1770), F (303)
- delayed: whether the sample was allowed extra days to recover from abx treatment; logical, T (455), F (1618)
- preAbx: whether the sample was collected prior to abx treatment; logical, T (154), F (1919)
- recovDays: number of days after stopping abx (metro and amp for the 5 day recovery); ranges from 1 to 5
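The CFU note above (change NAs to 0 except for NAs after day 1) could be sketched as follows. Reading "after day 1" as `day > 1` is an assumption, so the cutoff is left as a parameter; the function name `fill_absent_cfu` and the example rows are hypothetical.

```r
# Hypothetical rows using the metadata column names mouse, day and CFU.
metadata <- data.frame(
  mouse = c("m1", "m1", "m1", "m2", "m2", "m2"),
  day   = c(-1, 1, 5, -1, 1, 5),
  CFU   = c(NA, 2e6, NA, NA, NA, 3e4)
)

# Replace missing CFU with 0 up to and including last_fill_day;
# later NAs are left alone because they may be genuinely missing measurements.
fill_absent_cfu <- function(metadata, last_fill_day = 1) {
  fillable <- is.na(metadata$CFU) & metadata$day <= last_fill_day
  metadata$CFU[fillable] <- 0
  metadata
}

filled <- fill_absent_cfu(metadata)
filled
```

Setting `last_fill_day = 0` would instead fill only samples taken on or before the day of challenge.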
- Only one mouse was not given abx, but it is listed as preAbx F for day -5. Is it possible this denotes a mock abx treatment(?)
- Question about mouse 600-2D-6 (cef, delivered via water): it should be pre-antibiotic but is listed as F. All other mice in the cage are preAbx on day -6, except this one. Since this abx was delivered via drinking water, it is likely a clerical error.
- Need to write a check script to make sure all mice in each cage are recorded as having the same treatment.
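A minimal sketch of such a check script, assuming the metadata is loaded as a data frame with the columns `cage`, `day`, `abx`, `dose` and `preAbx` described above. The function name `find_inconsistent` and the example rows are hypothetical.

```r
# Hypothetical metadata: one mouse in cage1 disagrees with its cage mates on preAbx.
meta <- data.frame(
  cage   = c("cage1", "cage1", "cage1", "cage2", "cage2"),
  mouse  = c("a", "b", "c", "d", "e"),
  day    = c(-6, -6, -6, -6, -6),
  abx    = c("cef", "cef", "cef", "vanc", "vanc"),
  dose   = c(0.5, 0.5, 0.5, 0.1, 0.1),
  preAbx = c(TRUE, TRUE, FALSE, TRUE, TRUE)
)

# Return the cage/day combinations whose mice do not all share the same
# values in the treatment columns.
find_inconsistent <- function(metadata, columns = c("abx", "dose", "preAbx")) {
  groups <- split(metadata, list(metadata$cage, metadata$day), drop = TRUE)
  n_distinct <- vapply(
    groups,
    function(g) nrow(unique(g[, columns, drop = FALSE])),
    integer(1)
  )
  names(n_distinct)[n_distinct > 1]
}

find_inconsistent(meta)
```

Each returned name is a cage and day joined by a dot, so the flagged rows can be pulled from the metadata and inspected by hand before deciding whether to correct them.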
## # A tibble: 100 x 6
## otu median_abundance rho pvalue pvalue_BH pvalue_bon
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Otu000003 1.26 -0.614 4.17e-50 7.76e-48 7.76e-48
## 2 Otu000064 0. -0.606 1.52e-48 1.41e-46 2.83e-46
## 3 Otu000070 0. -0.531 1.40e-35 6.88e-34 2.61e-33
## 4 Otu000006 0. -0.531 1.48e-35 6.88e-34 2.75e-33
## 5 Otu000041 0. -0.484 5.64e-29 2.10e-27 1.05e-26
## 6 Otu000010 1.65 0.482 1.05e-28 3.26e-27 1.96e-26
## 7 Otu000017 0.100 -0.470 3.47e-27 8.35e-26 6.46e-25
## 8 Otu000057 0. -0.470 3.59e-27 8.35e-26 6.68e-25
## 9 Otu000031 0. -0.459 6.39e-26 1.32e-24 1.19e-23
## 10 Otu000050 0. -0.453 3.83e-25 7.11e-24 7.11e-23
## # ... with 90 more rows
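The columns of this table (`rho`, `pvalue`, `pvalue_BH`, `pvalue_bon`) are consistent with a per-OTU Spearman correlation followed by Benjamini-Hochberg and Bonferroni adjustment. A minimal sketch of that kind of calculation is below; correlating against CFU, the simulated counts, and the helper name `correlate_otu` are assumptions, not the code that produced the table.

```r
set.seed(7)

# Hypothetical data: CFU per sample and counts for five OTUs.
n_samples  <- 40
cfu        <- 10^runif(n_samples, min = 2, max = 8)
otu_counts <- matrix(
  rpois(n_samples * 5, lambda = 3),
  ncol = 5,
  dimnames = list(NULL, sprintf("Otu%06d", 1:5))
)

# Spearman correlation of one OTU's abundance with CFU.
# Ties make the exact p-value unavailable, so the resulting warning is suppressed.
correlate_otu <- function(abundance, cfu) {
  test <- suppressWarnings(cor.test(abundance, cfu, method = "spearman"))
  c(median_abundance = median(abundance),
    rho              = unname(test$estimate),
    pvalue           = test$p.value)
}

# One row per OTU, then adjust the p-values across all OTUs tested.
results <- as.data.frame(t(apply(otu_counts, 2, correlate_otu, cfu = cfu)))
results$otu        <- rownames(results)
results$pvalue_BH  <- p.adjust(results$pvalue, method = "BH")
results$pvalue_bon <- p.adjust(results$pvalue, method = "bonferroni")

# Most significant OTUs first, columns in the same order as the table above.
results <- results[order(results$pvalue),
                   c("otu", "median_abundance", "rho", "pvalue", "pvalue_BH", "pvalue_bon")]
results
```

Both adjustments depend on the total number of OTUs tested, so they should be applied before any filtering of the table.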